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FFTS-C - A FAST FOURIER TRANSFORM SUBROUTINE 
FOR COMPLEX DATA 


DECUS Program Library Write-up DECUS No. 8-144 

1. ABSTRACT 

The Fast Fourier Transformation enables computation 
of the power spectrum of a time series in a minimum of 
time. Specifically, it reduces the number of computations 
required to calculate the Discrete Fourier Transformation 

of a series of N equally time spaced samples 
X N _i where N is a power of 2(N=2 n ). In fact, for 1024 
time samples, computation time is reduced by over 99%. 

FFTS-C (for Fast Fourier Transformation Subroutine) 
will transform up to 1024 complex points. It is written 
as a subroutine, and is I/O independent. The user must 
tailor his own input-output procedure to his particular 
environment. 

2. REQUIREMENTS 

2.1 Hardware 

A 4K PDP -8 with Extended Arithmetic Element Type 182 
or a PDP- 8 /I with EAE Type KE- 8 /I option is the 
minimum necessary hardware. 

2.2 Storage 

FFTS requires locations 3 to 12, 20 to 55, 400 to 
1577+N, and 3600 to 3577+N, where N is the octal 
number of points being transformed. 

3. LOADING PROCEDURE 

Make sure the BIN Loader is in core. If not, load it. 

Put 7777 in the SR. Press Load Address. Place FFTS on the 
reader and turn the reader on. Press start, and FFTS will 
load. Then load the user's program the same way as above 
and start it. 

4. USAGE 

4.1 Calling Sequences 

FFTS enables the user to take either the Fast Fourier 
Transform, (FFT) or its inverse (IFFT) of a complex 
time series. The subroutine FFT, which begins at 0400, 
calculates the FFT. Register DOFFT (normal location 0043), 
points to FFT, so a JMS I DOFFT (=4443) will call FFT. 

The subroutine \IFFT beginning at 0756 takes the inverse 
FFT. Since location DOIFFT (normally 0044) points to 
IFFT, IFFT can be executed simply by writing JMS I DOIFFT 
(= 4444 ). Both FFT and IFFT assume that the complex data 











to be handled has already been stored in memory 
(see sections 5 and 6.2). After the operation 
is complete, the results will be stored in 
memory in bit inverted order (see section 5 . 1 ). 

For FFT, the results are the complex co-efficients 
Sj (with the appropriate scale factors, as described 
in section 5.2) given by the equation in section 
1 (j=0, 1,...,N-1). For IFFT the results consist of 
a time sequence Xj (j=0,1,....,N-1). 

An example of a program that will transform a time 
series and then resynthesize the series from its 
spectrum is as follows; (see sections 5 and 6). 

*200 

/INPUT DATA AND ZERO IMAGINARY PARTS 
BEGIN, 


• 

/SERIES STORED AWAY 
JMS I DOFFT 
TAD SCALE 
DCA SCALT 
JMS I SORT 
JMS I DOIFFT 
TAD SCALE 
TAD SCALT 
DCA SCALE 


/TAKE FFT ~~— 

/GET SCALE FOR TRANSFORM - " 

/RE-ORDER THE TRANSFORMS 
/TAKE THE INVERSE 
/GET SCALE ON INVERSE 

/RESULTS =1 ORIGINAL DATA)*2 +( SCALE^ 


/OUTPUT RESULTS (NOW STORED IN BIT REVERSED ORDER) 


END, JMP BEGIN /START AGAIN 

SCALT, 0 

DOFFT = 43 

SORT = 37 

DOIFFT= 44 
SCALE = 50 

$ 

N0TE: THE REMARKS in THE FOLLOWING SECTIONS APPLY TO 
I FFT AS WELL AS FFTl - 

4.2 Execution Times 

The following is a table of execution times for the 
subroutine. 

Number of points transformed Time (seconds) 


1024 


4.47 











Number of points transformed 


Time (Seconds) 


512 
256 
128 

5. DETAILS OF STORAGE 
5.1 Data Storage 

A JMS I DOFFT causes a complex time series to be Fourier 
Transformed. That series is stored in sequential order in 
memory. More explicitly, the real parts of the data 
are stored sequentially after location XRTAB (=1600) 
and the imaginary parts are placed after location XITAB 
(= 3600 ). For example, the storage scheme for a N=4 
point transform would look as follows? 


/RE( ) DENOTES REAL PART. 


/IM( ) DENOTES IMAGINARY PART 


*1600 

XRTAB, RE(X n ) 

RE(X X ) 

re(x 2 ) 

re(x 3 ) 

*3600 

XITAB, IM(X 0 ) 

IM(Xi) 

IM(X 2 ) 

im(x 3 ) 


1.96 

.845 

.357 


On exit the results of the transformation will be in 
core. The real parts of the transforms (Fourier co¬ 
efficients) are stored in the registers following XRTAB, 
and the imaginary parts are stored in the locations 
following XITAB. But the transforms are stored in bit 
reversed order. This means that to find S say, the 
order of the bits of j, written in binary, J must be 
reversed. For example, to locate Sj in memory after a 
16 point (N=16, n=4) transformation has been completed, 
first write j=5 as a binary number of n=4 bits? j=0101 2 
and then reverse the order of the bits, giving 1010 2 , * 
which is This means the real part S 5 is stored 

in the position where X^ was originally placed. In 
memory this is location'XRTAB+'S . Because the user can 
save time by fetching the transforms for output from bit 
reversed order, the subroutine does not bother to re¬ 
shuffle them in memory before exiting. However, a 
subroutine SORT that reshuffles the co-efficients is 
provided, and may be called by a JMS I SORT (SORT=37). 

5.2 Data Scaling 

All calculations in FFTS are done with single precision 
fixed point signed binary fractions. The binary point 
is located between bit 0 and bit 1, leaving an 11 bit 
signed mantissa. Bit 0 is used as a sign bit. Negative 
numbers are formed by taking the two's complement of 
the positive binary fraction. So all inputs must be 
scaled in magnitude to less than one. The outputs are 
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also formatted as above. There is also a more 
subtle scale factor involved, in order to utilize 
the maximum number of bits in the transformation 
it is sometimesn ecess ary to divide by 2 in a com¬ 
putation. As a result of this 

a pseudo floating point format has been adopted in 
which a variable scale factor (or exponent) is im¬ 
posed on all the Fourier co-efficients. This scale 
factor or pseudo exponents is found in register 
SCALE (=50) after each transform has been completed. 

The numbers stored in memory are the Fourier co¬ 
efficients multiplied by 2 raised to the contents of 
SCALE. So to retrieve the co-efficients themselves, 
merely shift each number C(SCALE) places right. If 
any further computations are to be done, better 
accuracy will be obtained by retaining the pseudo 
exponent and leaving the co-efficients in “normalized 
form." in the case of the inverse transform, the 
desired results (here time samples) are the numbers 
stored in memory times 2f(-C(SCALE)) . In the 
program example of section 4.1 the scale factors 
after the transform and the inverse are saved, and later 
added. This is necessary because the inverse routine 
calls the transform routine, which adopts a “floating" 
point format. Hence the results of the inverse of 
the transform have to be scaled by the sums of the 
separate scaling factors. 


6 . RESTRICTIONS 


6.1 Program Initialization 


Because FFT is a subroutine certain registers must 
be primed before the first entry in order to insure 
proper operation. Specifically, register N (location 
0003) must contain the number of points being trans¬ 
formed (in octal, of course) and register NU (location 
0004) must contain the power of two which N is, that 
is, C(N)=2t'C (NU) .C(NU) must be at least 2 and no more 
than 12 g, due to memory limitations. 


6.2 Data Storage 


FFT assumes that the complex time series has been 
stored in memory. So the locations after XRTAB must 
contain these time samples (actually, their real parts). 
If the time series is complex valued, the imaginary 
parts must be stored in the registers after XITAB. If 
the series is only real valued, then the N-l registers 

after XITAB must be set to zero. The reason FFT does not 

do this itself is simply because that would not allow 
for the possibility of a complex transformation. 
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6.3 


Input Restrictions 


So as to prevent overflow of the single precision 
storage, it is absolutely necessary that all data be 
less than 1 in magnitude , subject to the format 
described in section 5.2. (The binary point is to the 
right of bit 0). 

7. METHODS 

7.1 Algorithm 


FFTS uses the algorithm discovered by Cooley and Tukey 
for the rapid computation of a spectrum. This al¬ 
gorithm, called the Fast Fourier Transformation (or FFT), 
permits transformation of N (which must be an integer 
power of 2) equally spaced time samples in a time 
proportional to Nlog 2 lf 7 ~whereas previous methods"re¬ 
quired times proportional to N . This gives a reduc¬ 
tion of l-log 2 N/&. For N=1024, this is over 99%. In 
essence, the algorithm makes use of the fact that 

v k_ w CKm«aiO 

- 2 ni/n 

(where W=e ) to reduce the number of multiplications 

necessary for a transformation. A complete description 
and proof of the algorithm used and its implementation 
can be found in an article by James Rothman which appears 
in DECUSCOPE, Volume 7, Number 3. 


8. DETAILS OF OPERATION 


The following is a list of useful subroutines and 
their operations: (values of the symbols may be found in 
the symbol table included in this document). 


Name Call By 

FFT JMS I DO FFT 

IFFT JMS I DOIFFT 

SORTX JMS I SORT 


Function 


Takes the Fourier Transformation 
of the data buffer. Results in 
bit reversed order. 

Takes the Inverse Fourier Trans¬ 
formation of the data buffer. 
Results in bit reversed order. 

Sort the data buffer so that it 
is in normal sequence. 
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Name 


Function 


Call by 
TRIGET JMS I GETRIG 


INVRT 


JMS I INVERT 


MULTIP 


JMS I MULT 


9. SYMBOL TABLE: 

A symbol table follows: 


Fetches sine and cosine values. 
Specifically, if the AC=K on entry, 
the values of sin (2 jtk/!j) and cos 
(2;rK/to) are fetched from an in¬ 
ternal trig table, k must be 
<or=N/2. A register COSINE contains 
the cosine value and the AC contains 
the sine value on exit. 

Number in AC is bit reversed and 
the result is in the AC on exit. 

Rounded single precision signed 
multiply, uses EAE. AC=multiplier. 
C(Call address + l)=address of 
multiplicand. Result in AC on exit. 
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symbol 

1 Ad„t 

SYMBOL 

T ABLE 

a joer 

0036 

g 

0024 

AUDR 

1134 

Ul 

0021 

AUOrtOS 

1136 

Ur 

0020 

AUDI 

1173 

UUAOl 

1110 

AuD2 

0030 

JUAU2 

1072 

aujsg n 

0567 

RdUlLU 

0700 

AHG2 

1017 

KLCH6 

3702 

asr 

7415 

RtSETC 

3701 

6IGSNU 

0012 

RtVERS 

0707 

BUILO 

0543 

S 

0006 

c 

0027 

SU A 

7441 

GAM 

7621 

SUALE 

3050 

CC I A 

0767 

SoL 

7403 

CrtKPT 

0514 

StTC 

0541 

CMOP 

0770 

SGNAOJ 

0766 

C !M 0 T S 

0676 

SHFCHK 

0052 

GOSIML 

0033 

shflag 

0051 

OATAHI 

5600 

smfti 

1077 

GUFF T 

0043 

SmF T2 

1114 

UUIF FT 

0044 

SHFT3 

1125 

U V I 

7407 

SB IF Cl 

3561 

F 

0007 

SHIFT1 

0053 

FFT 

0400 

SrilF T2 

3054 

FLIP 

1044 

Sh IF T 3 

0055 

FLIPCT 

1060 

Sml 

7413 

GtyKIG 

0042 

sign 

1035 


3035 

SINL 

0332 

GH 

0034 

S1NL0C 

0045 

IF FT 

0756 

S1NRET 

1122 

1MDEX 

1133 

sintab 

1175 

1MVERT 

0040 

SURT 

0037 

IMVRT 

1036 

SORT X 

0/03 

K 

0026 

SHAPED 

0747 

L 

0005 

TfcMPR 

0031 

LUOP1 

0440 

IRI GET 

1061 

LSR 

7417 

RURG 

1056 

MAXnU 

0011 

WURJP 

1^57 

MMQVR2 

0012 

XiTAB 

3630 

Mg a 

7501 

xlocdf 

0047 

mul 

7421 

XrtLUC 

0046 

mult 

0041 

XRTA3 

1600 

mult IP 

1000 

XBUtv) 

1174 

muy 

7405 



M 

0003 



MM I 

7411 



MUROT 

3563 



MUTNOR 

1171 



MUVER4 

0010 



NU4MIK 

1132 



MU 

00 Z 4 



P 

0025 



PI 

0023 



PR 

0022 
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ADDENDUM TO 8-143 and 8-144 


The program was structured so to make the change of eliminating 
the EAE requirement with a minimum of effort. 

All that need be done is replace each EAE instruction with a 
subroutine that performs the given operation using a pseudo 
multiplier-quotient. For this purpose the EAE simulator may be 
used. This does not allow certain microcodes, and where these 
occur in the FFT program, they can be separated into groups of 
EAE instructions, all of which together perform the designated 
function. 

For example CLA MQL MUY (microcoed) could become the three 
instructions: 

CLA 

MQL 

MQA. 
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CORRECTION TO DECUS NO. 8-143 AND 8-144 


ORIGINAL 
*1000 
MULTIP, 0 


RAL 

DCA SIGN 
MUY 

ARG2, HLT ARG2 

SHL 

0 

DCA ARG2 
SHL 
0 

MQL 

TAD SIGN 
CLL RAR 
TAD ARG2 
MQA 
SZL 

CMA IAC 
JMP I MULTIP 

SIGN, 0 SIGN, 


CORRECTED CHANGE 

*1000 

MULTIP, 0 


RAR 

DCA SIGN 

MUY 

HLT 

SHL 

0 

DCA ARG2 

* 

TAD SIGN 

* 

SHL 

★ 

0 

* 

TAD ARG2 

* 

SPA 

* 

CLA CLL CMA RAR 

* 

NOP 

SZL 

CMA IAC 

JMP 1 MULTIP 

0 

* 


The error was in the way in which rounding was accomplished. This fix was tested by performing a 
DOFFT, SORT, DOIFFT, SORT sequence on a 512 point real valued time series with 8-144 and then 
summing the absolute value of the imaginary residuals. The fix above reduced the sum by 40 percent. 





